CYTOMEGALOVIRUS INTRON A FRAGMENTS

Cross-Reference to Related Applications 
This application is related to provisional patent application serial no. 60/240,502 
filed October 13, 2000, from which application priority is claimed under 35 USC 
§119(e)(1) and which application is incorporated herein by reference in its entirety.

Technical Field 

The present invention relates generally to recombinant gene expression systems. 
More particularly, the invention relates to novel cytomegalovirus (CMV) Intron A 
fragments for use in expression constructs for expressing gene products, and methods of 
using the same. 

Background Of The Invention 

Proteins are conveniently produced in a variety of procaryotic and eucaryotic 
recombinant expression systems. For example, Escherichia coli-derived plasmid DNA
vectors are widely used to express proteins both in vitro and in vivo. In vitro, such
vectors are used for purposes ranging from preliminary evaluation of the nature of
protein expression to large-scale manufacture of recombinant proteins. In vivo, DNA
vectors are used, for example, for gene therapy and nucleic acid vaccination. 

In general, effective vectors are those that express high levels of protein due to the 
use of efficient promoters and other control elements. Other factors that may contribute 
to efficient transfection of cells include: (1) uptake of plasmid by cells; (2) escape of 
plasmid from endocytic vesicles after endocytosis; (3) translocation of the plasmid from 
the cytoplasm into the nucleus; and (4) transcription of the plasmid in the nucleus. 

Work from several laboratories suggests that a major barrier to efficient
transfection is translocation of the plasmid into the nucleus, particularly in cells that do
not undergo mitosis (e.g., myocytes). One parameter that may affect this step is the size
of the plasmid, as the nuclear pore complex involved in uptake of macromolecules into
the nucleus has a finite size. Hence, it is desirable to engineer small plasmids that retain
the ability to express proteins at high levels. This has the potential to facilitate DNA
delivery and allows the insertion of larger gene inserts than is feasible in larger plasmids.
The latter point is particularly important for preparation of certain recombinant viral
vectors that have a limited capacity to package plasmids, such as alphavirus and
adeno-associated vectors.

One particularly effective system for the production of recombinant proteins 
employs vectors containing the human cytomegalovirus (hCMV) immediate-early (IE1) 
enhancer/promoter region which controls transcription of the immediate-early 72,000 
molecular weight protein of hCMV. See, e.g., Chapman et al., Nuc. Acids Res. (1991)
19:3979-3986; and U.S. Patent No. 5,688,688. The hCMV IE1 enhancer/promoter is one
of the strongest enhancer/promoters known and is active in a broad range of cell types. 

The hCMV IE1 enhancer/promoter region (Figure 2) includes a tissue-specific 
modulator, multiple potential binding sites for several different transcription factors, and 
a complex enhancer. The transcribed region of the gene contains four exons and three
introns. The largest of the introns, termed "Intron A," is found within the 5'-untranslated
region of the gene. See, e.g., Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986 for
the sequence and structure of this region in hCMV strain Towne, and Akrigg et al., Virus
Res. (1985) 2:107-121, for a description of the corresponding region in hCMV strain
AD169. The Intron A region of the hCMV IE1 enhancer/promoter has been shown to
contain elements that enhance expression of heterologous proteins in mammalian cells.
See, e.g., Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986. 

Introns are non-coding regions present in most pre-mRNA transcripts produced in 
the mammalian cell nucleus. Intron sequences can profoundly enhance gene expression 
when included in heterologous expression vectors. See, e.g., Buchman et al., Molec. Cell
Biol. (1988) 8:4395-4405; Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986. Recent
studies have demonstrated a connection between pre-mRNA splicing and export from the
nucleus of mature mRNAs to the cytoplasm. Cullen, B.R., Proc. Natl. Acad. Sci. USA
(2000) 97:4-6; and Luo et al., Proc. Natl. Acad. Sci. USA (1999) 96:14937-14942.
Accordingly, increased levels of expression, such as those seen with the Intron A region
of the hCMV IE1 enhancer/promoter, may be due to increased levels of translatable
mRNAs in the cytoplasm. 

Summary of the Invention 

Accordingly, the present invention provides CMV Intron A fragments for use in
expression constructs. The fragments retain the ability to enhance expression levels when
present in such expression constructs. The use of Intron A fragments is desirable,
especially when used in recombinant viral vectors with size constraints for packaging
plasmids, such as alphavirus and adeno-associated vectors. Thus, the present invention
provides a highly efficient expression system for the production of recombinant proteins
in therapeutically useful quantities, both in vitro and in vivo.

Accordingly, in one embodiment, the subject invention is directed to an hCMV 
Intron A fragment, wherein the fragment lacks the full-length Intron A sequence and 
comprises: (a) a sequence of nucleotides having at least about 75% sequence identity to 
the contiguous sequence of nucleotides found at positions 1-25, inclusive, of Figure 1A,
and (b) a sequence of nucleotides having at least about 75% sequence identity to the
contiguous sequence of nucleotides found at positions 775-820, inclusive, of Figure 1A.
Further, when the fragment is present in an expression construct, the expression construct
achieves expression levels greater than those levels achieved by a corresponding
construct that completely lacks an Intron A sequence. In certain embodiments, the
expression levels achieved are at least two-fold, or at least ten-fold, or at least fifty-fold
greater than those levels achieved by a corresponding construct that completely lacks an 
Intron A sequence. 

In another embodiment, the invention is directed to an Intron A fragment that 
comprises: (a) a sequence of nucleotides having at least about 75% sequence identity to 
the contiguous sequence of nucleotides found at positions 1-51, inclusive, of Figure 1A,
and (b) a sequence of nucleotides having at least about 75% sequence identity to the
contiguous sequence of nucleotides found at positions 741-820, inclusive, of Figure 1A,
wherein when the fragment is present in an expression construct, the expression construct
achieves expression levels greater than those levels achieved by a corresponding
construct that completely lacks an Intron A sequence. In certain embodiments, the
expression levels achieved are at least two-fold, or at least ten-fold, or at least fifty-fold
greater than those levels achieved by a corresponding construct that completely lacks an 
Intron A sequence. 

In another embodiment, the Intron A fragment comprises the sequence of 
nucleotides 1-51, inclusive, of Figure 1A, linked to nucleotides 741-820, inclusive, of
Figure 1A. 

In still a further embodiment, the Intron A fragment comprises the Intron A 
nucleotide sequence depicted in Figure 1C, or a nucleotide sequence with at least about 
75% sequence identity thereto.

In another embodiment, the Intron A fragment consists of the Intron A nucleotide
sequence depicted in Figure 1C.

In yet another embodiment, the invention is directed to an hCMV Intron A 
fragment, wherein the fragment lacks the full-length Intron A sequence and comprises:
(a) a sequence of nucleotides having at least about 75% sequence identity to the
contiguous sequence of nucleotides found at positions 1-25, inclusive, of Figure 1A, and
(b) a sequence of nucleotides having at least about 75% sequence identity to the
contiguous sequence of nucleotides found at positions 775-820, inclusive, of Figure 1A,
wherein when the fragment is present in an expression construct, the expression construct
achieves expression levels equal to, or greater than, those levels achieved by an
expression construct that includes a corresponding intact, full-length Intron A sequence.

In another embodiment, the invention is directed to an hCMV Intron A fragment, 
wherein the fragment lacks the full-length Intron A sequence and comprises: (a) a 
sequence of nucleotides having at least about 75% sequence identity to the contiguous 
sequence of nucleotides found at positions 1-51, inclusive, of Figure 1A, and (b) a
sequence of nucleotides having at least about 75% sequence identity to the contiguous
sequence of nucleotides found at positions 741-820, inclusive, of Figure 1A, wherein
when the fragment is present in an expression construct, the expression construct 
achieves expression levels equal to, or greater than, those levels achieved by an 
expression construct that includes a corresponding intact, full-length Intron A sequence.

In further embodiments, the invention is directed to recombinant expression
constructs comprising (a) a coding sequence; and (b) control elements that are operably
linked to the coding sequence, wherein the control elements comprise the Intron A
fragment described herein, whereby the coding sequence can be transcribed and translated
in a host cell. In certain embodiments, the control elements further comprise a promoter
selected from the group consisting of an SV40 early promoter, a CMV promoter, a mouse
mammary tumor virus LTR promoter, an adenovirus major late promoter, an RSV
promoter, an SRα promoter, and a herpes simplex virus promoter. Particularly, the control
elements may comprise the hCMV immediate-early (IE1) enhancer/promoter region
found at nucleotide positions 460 to 1264 of Figure 2, and Exon 2 of the 5'-UTR
comprising the sequence of nucleotides depicted at positions 821-834, inclusive, of
Figure 1A. Host cells comprising the expression constructs and methods of producing a
recombinant polypeptide are also provided. 

In another embodiment, the invention is directed to a polynucleotide comprising 
the sequence depicted in Figure 5B. 

These and other aspects of the present invention will become evident upon
reference to the following detailed description and attached drawings. In addition,
various references are set forth herein which describe in more detail certain procedures or 
compositions, and are therefore incorporated by reference in their entirety. 

Brief Description of the Drawings

Figure 1A (SEQ ID NO:1) shows the sequence of a representative CMV IE1
Intron A from hCMV strain Towne. Also shown in Figure 1A is the portion of the
sequence deleted from deletion mutant pCON3. The splice donor sequence is bolded and
shown with an arrow. The splice acceptor sequence is underlined and designated with an
arrow. Possible branch points are indicated.

Figure 1B (SEQ ID NO:2) shows the oligonucleotide corresponding to the
retained 3'-portion of the deleted Intron A construct of pCON3 as compared with the
3'-portion of wild-type Intron A.

Figure 1C (SEQ ID NO:3) shows the Intron A sequence of deletion mutant
pCON3.

Figure 2 (SEQ ID NO:4) (GenBank accession number M60321 and Chapman et 
al., Nuc. Acids Res. (1991) 19:3979-3986) shows the nucleotide sequence of the 5' region 
of the major immediate-early gene of hCMV, including the enhancer/promoter region. 
The enhancer region (nucleotides -600 to -1081), the Pol II promoter (nucleotides
1081-1143), Exon 1 of the 5' UTR (nucleotides 1144-1264), Intron A (nucleotides 1265-2088)
and Exon 2 of the 5' UTR (nucleotides 2089-2096) are shown. The TATAA and CAAT
boxes, as well as the start codon, are boxed. 

Figure 3 shows various Intron A deletion mutants as described in the examples. 

Figure 4 depicts normalized luciferase expression by the various deletion mutants
shown in Figure 1C and Figure 3.

Figure 5A (SEQ ID NO:5) shows the wild-type rabbit β-globin gene sequence
used in the examples.

Figure 5B (SEQ ID NO:6) shows the optimized rabbit β-globin gene sequence
used in the examples.

Figure 6 shows luciferase expression as a measure of p55gag expression by parent
vector, pCMVkm-Luciferase, as compared to RpG-IVSI (containing the wild-type rabbit
β-globin gene sequence shown in Figure 5A) and RpG-OPTI (containing the optimized
rabbit β-globin gene sequence shown in Figure 5B).

Figure 7 depicts anti-p55gag titers from mice immunized with various constructs
including the Intron A fragment, as described in the examples.

Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated, 
conventional methods of chemistry, biochemistry, recombinant DNA techniques and 
immunology, within the skill of the art. Such techniques are explained fully in the
literature. See, e.g., Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd
Edition); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press,
Inc.); DNA Cloning, Vols. I and II (D.N. Glover ed.); Oligonucleotide Synthesis (M.J.
Gait ed.); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds.); Animal Cell
Culture (R.K. Freshney ed.); Perbal, B., A Practical Guide to Molecular Cloning.

All publications, patents and patent applications cited herein, whether supra or 
infra, are hereby incorporated by reference in their entirety. 

It must be noted that, as used in this specification and the appended claims, the 
singular forms "a", "an" and "the" include plural referents unless the content clearly 
dictates otherwise. Thus, for example, reference to "an antigen" includes a mixture of
two or more antigens, and the like. 

The following amino acid abbreviations are used throughout the text: 
Alanine: Ala (A)          Arginine: Arg (R)
Asparagine: Asn (N)       Aspartic acid: Asp (D)
Cysteine: Cys (C)         Glutamine: Gln (Q)
Glutamic acid: Glu (E)    Glycine: Gly (G)
Histidine: His (H)        Isoleucine: Ile (I)
Leucine: Leu (L)          Lysine: Lys (K)
Methionine: Met (M)       Phenylalanine: Phe (F)
Proline: Pro (P)          Serine: Ser (S)
Threonine: Thr (T)        Tryptophan: Trp (W)
Tyrosine: Tyr (Y)         Valine: Val (V)

I. Definitions

In describing the present invention, the following terms will be employed, and are 
intended to be defined as indicated below.

By "Intron A fragment" is meant a fragment derived from an Intron A sequence of 
a CMV immediate-early enhancer/promoter region, which does not include the entire 
Intron A sequence. A representative hCMV enhancer/promoter region is shown in Figure 
2. The intact Intron A sequence is represented by the lowercase nucleotides spanning
positions 1265-2088 of Figure 2 and nucleotides 1-820 of Figure 1A. The Intron A
fragment of the present invention comprises a deletion from the full-length sequence,
which deletion may be internal or occur at the 5'- and/or 3'-ends of the Intron A region, so
long as the region still functions to permit authentic splicing in the nucleus of primary
transcripts that include the Intron A fragment. Preferably, an "Intron A fragment"
includes the minimum number of bases or elements necessary to achieve expression
levels over those achieved in corresponding constructs that completely lack an Intron A
sequence. More preferably, expression levels achieved by constructs that include the
Intron A fragment of the invention are at least two-fold over those levels achieved
without the presence of the Intron A region, preferably at least ten-fold greater, most
preferably at least twenty- to fifty-fold greater, or more, than those levels achieved
without the Intron A region. Preferably, expression levels are at least equal to, or greater
than, for example at least two-fold greater than, those levels achieved when the intact,
full-length Intron A sequence is present in a corresponding expression construct. Such
comparisons are typically made by making expression constructs that include all elements
of the test construct, but either completely lack the Intron A sequence, or include the full- 
length Intron A sequence (see the Examples herein). 
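
The fold comparisons described above can be illustrated with a minimal Python sketch. The function names and the reporter readings below are hypothetical placeholders chosen for illustration; they are not data from the Examples, and the sketch assumes that expression has already been reduced to a single normalized reporter value per construct.

```python
def fold_enhancement(test_signal, control_signal):
    """Return the ratio of test-construct expression to control-construct expression."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return test_signal / control_signal


def meets_threshold(test_signal, control_signal, min_fold=2.0):
    """True if the test construct reaches at least min_fold times the control level."""
    return fold_enhancement(test_signal, control_signal) >= min_fold


if __name__ == "__main__":
    # Hypothetical normalized reporter readings (arbitrary units), for illustration only.
    no_intron = 100.0      # construct completely lacking an Intron A sequence
    full_length = 2000.0   # construct with intact, full-length Intron A
    fragment = 2500.0      # construct with an Intron A fragment

    print("fold over no-intron control:", fold_enhancement(fragment, no_intron))
    print("at least ten-fold:", meets_threshold(fragment, no_intron, min_fold=10.0))
    print("equal to or greater than full-length:",
          meets_threshold(fragment, full_length, min_fold=1.0))
```

The same ratio serves both comparisons named in the definition: against a construct lacking Intron A (two-fold, ten-fold, or twenty- to fifty-fold thresholds) and against a construct carrying the full-length intron (a threshold of 1.0 for "at least equal to").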

Thus, an "Intron A fragment" of the present invention will generally include at
least the 5' splice junction sequence (nucleotides 1-7 as shown in Figure 1A), usually at
least up to the first 25 5'-nucleotides of the Intron A region (nucleotides 1-25 of Figure
1A), more preferably at least up to the first 30 nucleotides of the Intron A region
(nucleotides 1-30 of Figure 1A), even more preferably at least up to the first 40
nucleotides of the Intron A region (nucleotides 1-40 of Figure 1A), more preferably at
least up to the first 51 nucleotides of the Intron A region (nucleotides 1-51 of Figure 1A),
and even up to the first 75 or more nucleotides of the Intron A region, and any integer
between these values, or even more of the 5'-region of Intron A.

Moreover, in addition to the 5'-sequence described above, an "Intron A fragment"
will optionally include at least the 3' splice junction sequence (nucleotides 815-820 of
Figure 1A). Generally, the Intron A fragment will include at least up to the 25
3'-nucleotides of the Intron A sequence shown in Figure 1A (nucleotides 796-820 of Figure
1A), preferably up to the 50 3'-nucleotides of the sequence shown in Figure 1A
(nucleotides 771-820 of Figure 1A), more generally up to the 70 3'-nucleotides
(nucleotides 751-820 of Figure 1A), preferably at least up to the 80 3'-nucleotides
(nucleotides 741-820 of Figure 1A), or even more of the 3'-region, such as the 100-150
3'-nucleotides, and any integer between these values, or more of the 3'-region of Intron A.
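
The coordinate ranges quoted for the retained 3'-region follow from simple arithmetic on the 820-nucleotide numbering of Figure 1A. The short Python sketch below, with a hypothetical helper name, shows the conversion from "the last n nucleotides" to a 1-based, inclusive range.

```python
INTRON_A_LENGTH = 820  # Intron A spans nucleotides 1-820 in the Figure 1A numbering


def three_prime_region(n, intron_length=INTRON_A_LENGTH):
    """Return the 1-based, inclusive (start, end) of the last n nucleotides."""
    if not 1 <= n <= intron_length:
        raise ValueError("n must be between 1 and the intron length")
    return (intron_length - n + 1, intron_length)


if __name__ == "__main__":
    # Region sizes named in the text: 25, 50, 70 and 80 3'-nucleotides.
    for n in (25, 50, 70, 80):
        start, end = three_prime_region(n)
        print(f"last {n} nucleotides: {start}-{end}")
```

For example, the last 80 nucleotides begin at position 820 - 80 + 1 = 741, which is the 741-820 range used throughout the specification.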

Thus, it is apparent that an Intron A fragment according to the present invention 
may include a variety of internal deletions, such as about 10 to about 750 or more 
nucleotides of the Intron A sequence, preferably about 25 to about 700 or more 
nucleotides, more preferably about 50 to about 700 nucleotides, and most preferably
about 500 to about 680-690 or more nucleotides, or any integer between the above
ranges, so long as an expression construct including the Intron A fragment either 
enhances expression relative to a corresponding construct completely lacking an Intron A 
sequence, or provides equivalent or enhanced expression relative to a corresponding 
construct which includes the entire Intron A sequence, as described above. 

The retained 5'- and 3'-regions of the Intron A fragment of the present invention
may be directly linked to one another, e.g., as shown in Figure 1A, or the 5'- and 3'-
regions of the Intron A fragment may be linked together via a linker sequence. The linker
sequence may comprise from 1 up to about 400 or more nucleotides, or any integer
between these values, and may comprise regions for particular transcription factors, such as
NF1 binding sites, tissue-specific enhancer sequences, such as muscle-specific enhancers,
and the like.

One representative Intron A fragment sequence comprises the sequence of 
nucleotides at positions 1-51 linked to nucleotides 741-820, of Figure 1A, thus 
comprising an internal deletion of nucleotides 52-740, as shown in Figure 1A. Also
included in this construct is Exon 2 of the 5' UTR of the hCMV enhancer/promoter
region, nucleotides 821-834 of Figure 1A. 
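
As a rough illustration of how such a fragment is assembled from the Figure 1A coordinates, the following Python sketch joins a retained 5'-region directly to a retained 3'-region and reports the size of the internal deletion. The function names are hypothetical, and the 820-nucleotide string is a synthetic placeholder rather than the actual Intron A sequence, which appears only in the figures.

```python
def make_fragment(intron, five_prime_end, three_prime_start):
    """Join nucleotides 1..five_prime_end to three_prime_start..end (1-based, inclusive)."""
    if not (1 <= five_prime_end < three_prime_start <= len(intron)):
        raise ValueError("coordinates must satisfy 1 <= 5' end < 3' start <= intron length")
    # Python slices are 0-based and end-exclusive, hence the -1 on the 3' start.
    return intron[:five_prime_end] + intron[three_prime_start - 1:]


def deleted_length(five_prime_end, three_prime_start):
    """Number of nucleotides removed between the retained 5'- and 3'-regions."""
    return three_prime_start - five_prime_end - 1


# Synthetic 820-nt placeholder; NOT the real Intron A sequence.
PLACEHOLDER_INTRON = "ACGT" * 205

if __name__ == "__main__":
    fragment = make_fragment(PLACEHOLDER_INTRON, 51, 741)
    print("fragment length:", len(fragment))          # 51 + 80 retained nucleotides
    print("deleted nucleotides:", deleted_length(51, 741))
```

With the representative coordinates, the deletion of nucleotides 52-740 removes 689 nucleotides and leaves a 131-nucleotide fragment, consistent with the "about 500 to about 680-690" deletion range given above.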

An "Intron A fragment" as used herein, encompasses sequences with identity to 
an Intron A fragment isolated from any of the various hCMV strains, such as for example 
hCMV strain Towne and hCMV strain AD169, as well as polynucleotides that are
substantially homologous to the reference molecule (as defined below) and which still
function as described above. Thus, for example, the fragment shown in Figure 1C
includes nucleotide substitutions at the branch points and in the polypyrimidine tract to
conform these sequences to consensus sequences, as shown in Figures 1B and 1C.
Preferably, but not necessarily, the branch points retain termination codons, i.e., TAA,
TAG or TGA. Moreover, portions of the molecule outside of the splice donor and splice
acceptor regions are more amenable to change. In this regard, it is preferable to retain the
5' GT found at the 5' splice junction, and preferably the first six base pairs found at the 5'
splice junction. It is also preferable to retain the 3' AG found at the 3' splice junction,
preferably the three base pairs, CAG, found at the 3' splice junction. The nucleotides
found in these regions are preferably at least 80% homologous to the sequence of
nucleotides present in the native sequence shown in Figure 1A, but may be less
homologous as long as the Intron A fragment retains function, as defined above. Further, 
the polypyrimidine tract region is preferably one where substantially all of the bases are 
Ts or Cs. 
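
The sequence features preferred above (a 5' GT, a 3' AG or CAG, and a polypyrimidine tract that is substantially all Ts or Cs) lend themselves to a simple check. The Python sketch below is illustrative only: the function names are hypothetical, the test sequences are toy strings rather than Intron A sequences, and passing these checks says nothing about whether a fragment actually splices or enhances expression.

```python
def has_canonical_splice_ends(intron):
    """True if the intron begins with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def ends_with_cag(intron):
    """True if the last three bases are CAG, the preferred 3' splice junction."""
    return intron.upper().endswith("CAG")


def pyrimidine_fraction(tract):
    """Fraction of bases in the tract that are T or C."""
    seq = tract.upper()
    if not seq:
        raise ValueError("tract must not be empty")
    return sum(base in "CT" for base in seq) / len(seq)


if __name__ == "__main__":
    # Toy intron: GT...polypyrimidine tract...CAG (not a real sequence).
    toy_tract = "TTTCTCTCC"
    toy_intron = "GTAAGT" + "A" * 10 + toy_tract + "ACAG"
    print("GT...AG ends:", has_canonical_splice_ends(toy_intron))
    print("ends with CAG:", ends_with_cag(toy_intron))
    print("pyrimidine fraction of tract:", pyrimidine_fraction(toy_tract))
```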

The terms "polypeptide" and "protein" refer to a polymer of amino acid residues
and are not limited to a minimum length of the product. Thus, peptides, oligopeptides,
dimers, multimers, and the like, are included within the definition. Both full-length
proteins and fragments thereof are encompassed by the definition. The terms also include
postexpression modifications of the polypeptide, for example, glycosylation, acetylation,
phosphorylation and the like.

For purposes of the present invention, the polypeptide expressed by the coding 
sequence may be one useful in a vaccine, therapeutic or diagnostic and may be derived 
from any of several known viruses, bacteria, parasites and fungi, as well as any of the 
various tumor antigens. Alternatively, the expressed polypeptide may be a therapeutic
hormone, a transcription or translation mediator, an enzyme, an intermediate in a
metabolic pathway, an immunomodulator, and the like. 

Furthermore, for purposes of the present invention, a "polypeptide" refers to a 
protein which includes modifications, such as deletions, additions and substitutions 
(generally conservative in nature), to the native sequence, so long as the protein maintains
the desired activity. These modifications may be deliberate, as through site-directed
mutagenesis, or may be serendipitous, such as through mutations of hosts which produce
the proteins or errors due to PCR amplification. 

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is a 
nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are determined
by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy)
terminus. A coding sequence can include, but is not limited to, cDNA from viral,
procaryotic or eucaryotic mRNA, genomic DNA sequences from viral (e.g. DNA viruses
and retroviruses) or procaryotic DNA, and synthetic DNA sequences. A transcription
termination sequence may be located 3' to the coding sequence. 

A "nucleic acid" molecule can include both double- and single-stranded 
sequences and refers to, but is not limited to, cDNA from viral, procaryotic or eucaryotic 
mRNA, genomic DNA sequences from viral (e.g. DNA viruses and retroviruses) or
procaryotic DNA, and especially synthetic DNA sequences. The term also captures
sequences that include any of the known base analogs of DNA and RNA. 

"Operably linked" refers to an arrangement of elements wherein the components 
so described are configured so as to perform their desired function. Thus, a given 
promoter operably linked to a coding sequence is capable of effecting the expression of
the coding sequence when the proper transcription factors, etc., are present. The
promoter need not be contiguous with the coding sequence, so long as it functions to
direct the expression thereof. Thus, for example, intervening untranslated yet transcribed
sequences can be present between the promoter sequence and the coding sequence, as can
transcribed introns, and the promoter sequence can still be considered "operably linked"
to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, viral, semisynthetic, or synthetic origin which, by 
virtue of its origin or manipulation is not associated with all or a portion of the 
polynucleotide with which it is associated in nature. The term "recombinant" as used
with respect to a protein or polypeptide means a polypeptide produced by expression of a
recombinant polynucleotide. In general, the gene of interest is cloned and then expressed
in transformed organisms, as described further below. The host organism expresses the 
foreign gene to produce the protein under expression conditions. 

A "control element" refers to a polynucleotide sequence which aids in the 
expression of a coding sequence to which it is linked. The term includes promoters,
transcription termination sequences, upstream regulatory domains, polyadenylation
signals, untranslated regions, including 5'-UTRs (such as Exon 2 of the hCMV
enhancer/promoter region 5'-UTR) and 3'-UTRs and, when appropriate, leader sequences
and enhancers, which collectively provide for the transcription and translation of a coding
sequence in a host cell.

A "promoter" as used herein is a DNA regulatory region capable of binding RNA 
polymerase in a host cell and initiating transcription of a downstream (3' direction)
coding sequence operably linked thereto. For purposes of the present invention, a
promoter sequence includes the minimum number of bases or elements necessary to
initiate transcription of a gene of interest at levels detectable above background. Within
the promoter sequence is a transcription initiation site, as well as protein binding domains 
(consensus sequences) responsible for the binding of RNA polymerase. Eucaryotic 
promoters will often, but not always, contain "TATAA" boxes and "CAAT" boxes. 

A control sequence "directs the transcription" of a coding sequence in a cell when
RNA polymerase will bind the promoter sequence and transcribe the coding sequence
into mRNA, which is then translated into the polypeptide encoded by the coding 
sequence. 

A "host cell" is a cell which has been transformed, or is capable of transformation, 
by an exogenous DNA sequence. 

A "heterologous" region of a DNA construct is an identifiable segment of DNA
within or attached to another DNA molecule that is not found in association with the
other molecule in nature. For example, a sequence encoding a human protein other than
the immediate-early 72,000 molecular weight protein of hCMV is considered a
heterologous sequence when linked to an hCMV IE1 enhancer/promoter. Similarly, a
sequence encoding the immediate-early 72,000 molecular weight protein of hCMV will
be considered heterologous when linked to an hCMV promoter with which it is not
normally associated. Another example of a heterologous coding sequence is a construct
where the coding sequence itself is not found in nature (e.g., synthetic sequences having
codons different from the native gene). Allelic variation or naturally occurring
mutational events do not give rise to a heterologous region of DNA, as used herein.

By "selectable marker" is meant a gene which confers a phenotype on a cell 
expressing the marker, such that the cell can be identified under appropriate conditions. 
Generally, a selectable marker allows selection of transfected cells based on their ability to
thrive in the presence or absence of a chemical or other agent that inhibits an essential
cell function. Suitable markers, therefore, include genes coding for proteins which confer
drug resistance or sensitivity thereto, impart color to, or change the antigenic
characteristics of those cells transfected with a nucleic acid element containing the
selectable marker when the cells are grown in an appropriate selective medium. For
example, selectable markers include: cytotoxic markers and drug resistance markers,
whereby cells are selected by their ability to grow on media containing one or more of the
cytotoxins or drugs; auxotrophic markers by which cells are selected by their ability to
grow on defined media with or without particular nutrients or supplements, such as
thymidine and hypoxanthine; metabolic markers by which cells are selected for, e.g., their
ability to grow on defined media containing the appropriate sugar as the sole carbon
source, or markers which confer the ability of cells to form colored colonies on
chromogenic substrates or cause cells to fluoresce. Representative selectable markers are
described in more detail below. 

"Expression cassette" or "expression construct" refers to an assembly which is 
capable of directing the expression of the sequence(s) or gene(s) of interest. The
expression cassette includes control elements, as described above, such as a promoter or
promoter/enhancer (such as the hCMV IE1 enhancer/promoter) which is operably linked
to (so as to direct transcription of) the sequence(s) or gene(s) of interest, and often
includes a polyadenylation sequence as well. An expression cassette will also include an
Intron A fragment as defined above and, optionally, Exon 2 of the hCMV IE1
enhancer/promoter region. Within certain embodiments of the invention, the expression
cassette described herein may be contained within a plasmid construct. In addition to the
components of the expression cassette, the plasmid construct may also include one or
more selectable markers, a signal which allows the plasmid construct to exist as
single-stranded DNA (e.g., an M13 origin of replication), at least one multiple cloning site, and a
"mammalian" origin of replication (e.g., an SV40 or adenovirus origin of replication).

"Transformation," as used herein, refers to the insertion of an exogenous
polynucleotide into a host cell, irrespective of the method used for insertion: for
example, transformation by direct uptake, transfection, infection, and the like. For
particular methods of transfection, see further below. The exogenous polynucleotide may
be maintained as a nonintegrated vector, for example, an episome, or alternatively, may
be integrated into the host genome.

By "isolated" is meant, when referring to a polypeptide, that the indicated 
molecule is separate and discrete from the whole organism with which the molecule is 
found in nature or is present in the substantial absence of other biological
macromolecules of the same type. The term "isolated" with respect to a polynucleotide is a
nucleic acid molecule devoid, in whole or part, of sequences normally associated with it
in nature; or a sequence, as it exists in nature, but having heterologous sequences in 
association therewith; or a molecule disassociated from the chromosome. 

"Homology" refers to the percent identity between two polynucleotide or two
polypeptide moieties. Two DNA, or two polypeptide, sequences are "substantially
homologous" to each other when the sequences exhibit at least about 50%, preferably at
least about 75%, more preferably at least about 80%-85% (80, 81, 82, 83, 84, 85%),
preferably at least about 90%, and most preferably at least about 95%-98% (95, 96, 97,
98%), or more, or any integer within the range of 50% to 100%, sequence identity over a
defined length of the molecules. As used herein, substantially homologous also refers to
sequences showing complete identity to the specified DNA or polypeptide sequence.

In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-to-
amino acid correspondence of two polynucleotides or polypeptide sequences,
respectively. Percent identity can be determined by a direct comparison of the sequence
information between two molecules by aligning the sequences, counting the exact number
of matches between the two aligned sequences, dividing by the length of the shorter
sequence, and multiplying the result by 100. Readily available computer programs can
be used to aid in the analysis, such as ALIGN, Dayhoff, M.O. in Atlas of Protein
Sequence and Structure, M.O. Dayhoff ed., 5 Suppl. 3:353-358, National Biomedical
Research Foundation, Washington, DC, which adapts the local homology algorithm of
Smith and Waterman, Advances in Appl. Math. 2:482-489, 1981, for peptide analysis.
Programs for determining nucleotide sequence identity are available in the Wisconsin
Sequence Analysis Package, Version 8 (available from Genetics Computer Group,
Madison, WI), for example, the BESTFIT, FASTA and GAP programs, which also rely
on the Smith and Waterman algorithm. These programs are readily utilized with the
default parameters recommended by the manufacturer and described in the Wisconsin
Sequence Analysis Package referred to above. For example, percent identity of a
particular nucleotide sequence to a reference sequence can be determined using the
homology algorithm of Smith and Waterman with a default scoring table and a gap
penalty of six nucleotide positions.
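
As a rough illustration of the direct-comparison calculation described above (count the exact matches between two aligned sequences, divide by the length of the shorter sequence, and multiply by 100), the following Python sketch may be helpful. It assumes the two sequences have already been aligned by one of the programs mentioned and that gap positions are written as "-". The function name `percent_identity` and the gap symbol are illustrative choices and are not taken from any of the cited packages.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Matches are counted column by column; the divisor is the length of the
    shorter sequence with gap characters removed.
    """
    a = aligned_a.upper()
    b = aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)

    # Length of the shorter sequence, ignoring gap characters.
    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * matches / shorter
```

For two aligned eight-nucleotide sequences that differ at a single position, this sketch returns 87.5, which would fall within the "substantially homologous" ranges defined above.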

Another method of establishing percent identity in the context of the present
invention is to use the MPSRCH package of programs copyrighted by the University of
Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by
IntelliGenetics, Inc. (Mountain View, CA). From this suite of packages the
Smith-Waterman algorithm can be employed where default parameters are used for the scoring
table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of
six). From the data generated the "Match" value reflects "sequence identity." Other
suitable programs for calculating the percent identity or similarity between sequences are
generally known in the art, for example, another alignment program is BLAST, used with
default parameters. For example, BLASTN and BLASTP can be used with the
following default parameters: genetic code = standard; filter = none; strand = both; cutoff
= 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = HIGH
SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank
CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be
found at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST.

Alternatively, homology can be determined by hybridization of polynucleotides
under conditions which form stable duplexes between homologous regions, followed by
digestion with single-stranded-specific nuclease(s), and size determination of the digested
fragments. DNA sequences that are substantially homologous can be identified in a
Southern hybridization experiment under, for example, stringent conditions, as defined
for that particular system. Defining appropriate hybridization conditions is within the
skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid
Hybridization, supra.

II. Modes of Carrying out the Invention

Before describing the present invention in detail, it is to be understood that this 
invention is not limited to particular formulations or process parameters as such may, of 
course, vary. It is also to be understood that the terminology used herein is for the 
purpose of describing particular embodiments of the invention only, and is not intended
to be limiting.

Although a number of compositions and methods similar or equivalent to those 
described herein can be used in the practice of the present invention, the preferred 
materials and methods are described herein. 

As noted above, the present invention is based on the discovery of novel hCMV
Intron A fragments which are able to enhance expression of a downstream (3') sequence
relative to expression levels achieved in the absence of an Intron A sequence, or at least
provide for equivalent expression levels as those obtained using the intact, full-length
Intron A sequence. As explained above, the hCMV IE1 enhancer/promoter, from which
the Intron A sequence is derived, is one of the strongest enhancer/promoters known and is
active in a broad range of cell types. See, e.g., Chapman et al., Nuc. Acids Res. (1991)
19:3979-3986; and U.S. Patent No. 5,688,688. The use of active fragments from this
region effectively reduces the overall plasmid size for expression of a particular coding
sequence. This is particularly desirable when large coding sequences, and/or viral vectors
with limited ability to package large genes, are used. Moreover, the decrease in overall
size of the constructs effectively enhances efficiency of expression. Thus, the Intron A
fragments of the present invention surprisingly retain the ability to result in expression of
protein at high levels in vitro and in vivo and, in some cases, provide for higher
expression than vectors using the entire hCMV IE1 Intron A sequence. As shown in the
examples, these high levels of expression have provided for immune responses that are
comparable to, or even better than, those induced by the parent vector.

As explained above, the Intron A fragments for use herein will retain at least up to
the initial 7 nucleotides of the Intron A region, preferably at least up to the initial 25
nucleotides of the Intron A region (see, Figure 1A for a representative Intron A
sequence). In general, the Intron A fragment of the present invention will retain at least
up to the first 30 nucleotides of the Intron A region (nucleotides 1-30 of Figure 1A),
generally at least up to the first 40 nucleotides of the Intron A region (nucleotides 1-40 of
Figure 1A), more preferably at least up to the first 51 nucleotides of the Intron A region
(nucleotides 1-51 of Figure 1A), and even up to the first 75 or more nucleotides of the
Intron A region. Thus, the 5'-region may include 25, 26, 27, 28, 29, 30...50, 51, 52, 53,
54, 55...70, 71, 72, 73, 74, 75...85, 86, 87 or more of the 5'-nucleotides, and so on. It is
evident that any number of nucleotides specified above, as well as nucleotides falling
within the specified numbers, are intended to be encompassed herein, so long as an
expression construct containing the Intron A fragment functions as defined above.

The Intron A fragment will optionally also include a sufficient amount of the
3'-region of Intron A to function as described herein. Generally, then, the Intron A fragment
will include at least the 3' splice junction sequence (nucleotides 815-820 of Figure 1A),
preferably, at least up to the 25 3'-nucleotides of the Intron A sequence shown in Figure
1A (nucleotides 796-820 of Figure 1A), preferably up to the 50 3'-nucleotides of the
sequence shown in Figure 1A (nucleotides 771-820 of Figure 1A), more generally up to
the 70 3'-nucleotides (nucleotides 751-820 of Figure 1A), preferably at least up to the 80
3'-nucleotides (nucleotides 741-820 of Figure 1A), or even more of the 3'-region, such as
the 100-150 3'-nucleotides, and any integer between these values, or more of the 3'-region
of Intron A. Thus, the 3'-portion of the Intron A fragment may include 50, 51, 52, 53, 54,
55...70, 71, 72, 73, 74, 75...85, 86, 87...90, 92, 93, 94, 95, 96...110, 111, 112, and so on,
or more of the 3'-nucleotides of the Intron A region. It is evident that any number of
nucleotides specified above, as well as nucleotides falling within the specified numbers,
are intended to be encompassed herein.
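
The correspondence between "the last n 3'-nucleotides" and the nucleotide positions quoted above follows from simple arithmetic, given that the intron ends at position 820 in the numbering of Figure 1A. A minimal Python sketch, in which the constant `INTRON_A_LAST_POSITION` and the function name `three_prime_range` are illustrative names rather than anything defined in this specification:

```python
# Last nucleotide of Intron A in the Figure 1A numbering used in the text.
INTRON_A_LAST_POSITION = 820


def three_prime_range(n: int, last: int = INTRON_A_LAST_POSITION) -> tuple:
    """Return the 1-based, inclusive (start, end) positions of the last n nucleotides."""
    if not 1 <= n <= last:
        raise ValueError("n must be between 1 and the intron length")
    return (last - n + 1, last)


# Retained 3'-region sizes mentioned in the text.
for n in (25, 50, 70, 80):
    start, end = three_prime_range(n)
    print(f"last {n} 3'-nucleotides: positions {start}-{end}")
```

Under these assumptions the four sizes map to positions 796-820, 771-820, 751-820 and 741-820, in agreement with the ranges recited above.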

The 5'- and 3'-retained regions of the Intron A fragment of the present invention
may be directly linked to one another, e.g., there may be an internal deletion of the Intron
A sequence. This deletion may comprise, for example, 10-750 or more base pairs of the
intact Intron A region, preferably about 300-700 base pairs, and most preferably about
500-700 base pairs. As shown in Figure 1A, one preferable fragment includes a large
internal deletion of about 688 base pairs. This fragment therefore includes the sequence
of nucleotides at positions 1-51 directly linked to nucleotides 741-834 of Figure 1A, thus
comprising an internal deletion of nucleotides 52-740 of Intron A, as shown in Figure 1A.
Nucleotides 821-834 of Figure 1A represent Exon 2 of the 5'-UTR. Figure 3 shows
various Intron A fragment constructs with Intron A deletions ranging from 55 to 661 base 
pairs. 

Alternatively, the 5'- and 3'-regions of the Intron A fragment may be linked
together via a linker sequence. The linker sequence may comprise from 1 up to about
400 or more nucleotides, preferably from 10-100 nucleotides, or any integer between
these values, and may comprise regions for enhancers, particular transcription factors, such
as NF1 binding sites, and the like.
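
The direct-linkage and linker arrangements described above can be sketched in Python as follows. This is only an illustration of the coordinate bookkeeping: the sequence used here is a synthetic placeholder of the right length, not the sequence of Figure 1A, and the names `join_intron_regions` and `PLACEHOLDER_SEQ` are hypothetical.

```python
# Placeholder standing in for the 834-nucleotide sequence of Figure 1A
# (Intron A, positions 1-820, followed by Exon 2, positions 821-834).
# It is NOT the real sequence; it only has the right length.
PLACEHOLDER_SEQ = "".join("ACGT"[i % 4] for i in range(834))


def join_intron_regions(seq: str, five_prime_end: int, three_prime_start: int,
                        linker: str = "") -> tuple:
    """Join positions 1..five_prime_end to positions three_prime_start..end.

    Coordinates are 1-based and inclusive, as in the text. Returns the joined
    fragment and the number of nucleotides removed by the internal deletion.
    """
    if not 1 <= five_prime_end < three_prime_start <= len(seq):
        raise ValueError("coordinates out of range or overlapping")
    five_prime = seq[:five_prime_end]           # positions 1..five_prime_end
    three_prime = seq[three_prime_start - 1:]   # positions three_prime_start..end
    deleted = three_prime_start - five_prime_end - 1
    return five_prime + linker + three_prime, deleted


# The fragment described in the text: positions 1-51 linked to 741-834.
fragment, deleted = join_intron_regions(PLACEHOLDER_SEQ, 51, 741)
```

With these coordinates the joined fragment is 145 nucleotides long and positions 52-740 (689 positions) are removed, which the text describes as an internal deletion of about 688 base pairs. Passing a non-empty `linker` models the alternative in which the two retained regions are joined through a linker sequence.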

The Intron A fragment of the present invention can be isolated from a CMV
genomic library, as well as from plasmids containing the Intron A region, using an
appropriate probe and cloned for future use. Similarly, the sequence can be produced
synthetically, using known methods of polynucleotide synthesis (see, e.g., Edge, M.D.,
Nature (1981) 292:756; Nambair, et al., Science (1984) 223:1299; Jay, Ernest, J. Biol.
Chem. (1984) 259:6311), based on the known Intron A sequence. See, e.g., Chapman et
al., Nuc. Acids Res. (1991) 19:3979-3986 for the sequence and structure of the Intron A
region in hCMV strain Towne, and Akrigg et al., Virus Res. (1985) 2:107-121, for a
description of the corresponding region in hCMV strain AD169; and Figures 1A and 1C
herein.

One particularly convenient method for obtaining the Intron A fragment of the
present invention is to isolate Intron A (either alone, or in association with the rest of the
hCMV enhancer/promoter region) from any of the various plasmids known to contain the
same, using techniques well known in the art, as well as described in the examples herein.
In particular, hCMV Intron A can be obtained from plasmid pCMV6, as described in
Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986 and U.S. Patent No. 5,688,688,
incorporated herein by reference in its entirety. Once obtained, the Intron A sequence can
be manipulated to obtain deletion mutants thereof, such as by excising portions of the
Intron A sequence using restriction enzymes. Site specific DNA cleavage is performed
by treatment with a suitable restriction enzyme (or enzymes), under conditions which are
generally understood in the art, and the particulars of which are specified by the
manufacturer of these commercially available enzymes. See, e.g., New England Biolabs,
Product Catalog. For example, restriction endonucleases with various specificities have
been isolated from a wide range of prokaryotes and are well known in the art. See, e.g.,
Sambrook et al., supra. The choice of an appropriate restriction endonuclease depends on
the particular sequence targeted. One of skill in the art will readily recognize the proper
restriction enzyme to use for a desired sequence. If desired, size separation of the cleaved
fragments may be performed by polyacrylamide gel or agarose gel electrophoresis, using
standard techniques. A general description of size separations is found in, e.g., Sambrook
et al., supra. The Intron A sequence can then be ligated to other control sequences such
as an appropriate promoter (if the Intron A is isolated without the remaining hCMV IE1
enhancer/promoter region), and the desired coding sequence, using known techniques.

The sequence of the Intron A fragment can be optimized for use in particular
expression systems using techniques well known in the art. Additionally, portions of the
sequence of the fragment may be changed, e.g., by deleting or substituting possible
branch points, as well as other regions of the molecule. These regions of a representative
Intron A are shown in Figure 1A. One particular optimized sequence of the Intron A
fragment is shown in Figure 1C. As explained in the examples, this fragment was
obtained by first deleting most of the 3'-sequence of the Intron A region and then
substituting, by means of a synthetic oligonucleotide, the last 80 nucleotides of the Intron
A region with an optimized sequence, and including Exon 2 of the 5'-UTR region. The
optimized sequence was based on published branch point and polypyrimidine tract
consensus sequences. Alternatively, mutagenized sequences can be obtained by
techniques well known in the art, such as site-directed mutagenesis and polymerase chain
reaction (PCR) techniques where appropriate. See, e.g., Sambrook, supra.

Once obtained, the fragment can be used to direct the transcription of a desired
protein in a wide variety of cell types. Cis-acting control elements can be conveniently
associated with the Intron A fragment in order to optimize expression of the coding
sequence associated therewith. If proteins produced in the system are either naturally
secreted or engineered to be, the transformed cells may produce the protein product for
protracted time periods, further increasing yields. The system allows for the production
of a desired protein in an authentic configuration, with authentic post-translational
modifications, in a relatively pure form and in economically useful amounts.

Thus, the Intron A fragments of the present invention will find use in expression
constructs to express a wide variety of substances, including peptides which act as
antibiotics and antiviral agents, e.g., immunogenic peptides for use in vaccines and
diagnostics; recombinant antibodies; antineoplastics; immunomodulators, such as any of
the various cytokines including interleukin-1, interleukin-2, interleukin-3, interleukin-4,
and gamma-interferon; peptide hormones such as insulin, proinsulin, growth hormone,
GHRH, LHRH, EGF, somatostatin, SNX-111, BNP, insulinotropin, AMP, FSH, LH, PSH
and hCG, gonadal steroid hormones (androgens, estrogens and progesterone),
thyroid-stimulating hormone, inhibin, cholecystokinin, ACTH, CRF, dynorphins, endorphins,
endothelin, fibronectin fragments, galanin, gastrin, insulinotropin, glucagon,
GTP-binding protein fragments, guanylin, the leukokinins, magainin, mastoparans,
dermaseptin, systemin, neuromedins, neurotensin, pancreastatin, pancreatic polypeptide,
substance P, secretin, thymosin, and the like; and growth factors, such as PDGF, EGF,
KGF, IGF-1 and IGF-2, FGF, and the like.

More particularly, proteins for use in vaccines and diagnostics may be of viral,
bacterial, fungal or parasitic origin, including but not limited to, those encoded by human
and animal viruses and can correspond to either structural or non-structural proteins. For
example, the present system will find use for recombinantly producing a wide variety of
proteins from the herpesvirus family, including proteins derived from herpes simplex
virus (HSV) types 1 and 2, such as HSV-1 and HSV-2 glycoproteins gB, gD and gH;
proteins derived from varicella zoster virus (VZV), Epstein-Barr virus (EBV) and
cytomegalovirus (CMV) including CMV gB and gH; and proteins derived from other
human herpesviruses such as HHV6 and HHV7. (See, e.g., Chee et al.,
Cytomegaloviruses (J.K. McDougall, ed., Springer-Verlag 1990) pp. 125-169, for a
review of the protein coding content of cytomegalovirus; McGeoch et al., J. Gen. Virol.
(1988) 69:1531-1574, for a discussion of the various HSV-1 encoded proteins; U.S.
Patent No. 5,171,568 for a discussion of HSV-1 and HSV-2 gB and gD proteins and the
genes encoding therefor; Baer et al., Nature (1984) 310:207-211, for the identification of
protein coding sequences in an EBV genome; and Davison and Scott, J. Gen. Virol.
(1986) 67:1759-1816, for a review of VZV.)

Polynucleotide sequences encoding proteins from the hepatitis family of viruses,
including hepatitis A virus (HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), the
delta hepatitis virus (HDV), hepatitis E virus (HEV) and hepatitis G virus (HGV), can
also be conveniently used in the techniques described herein. By way of example, the
viral genomic sequence of HCV is known, as are methods for obtaining the sequence.
See, e.g., International Publication Nos. WO 89/04669; WO 90/11089; and WO
90/14436. The HCV genome encodes several viral proteins, including E1 (also known as
E) and E2 (also known as E2/NS1). (See, Houghton et al., Hepatology (1991) 14:381-
388, for a discussion of HCV proteins, including E1 and E2.) The sequences encoding
each of these proteins, as well as antigenic fragments thereof, will find use in the present
system. Similarly, the coding sequence for the δ-antigen from HDV is known (see, e.g.,
U.S. Patent No. 5,378,814) and this sequence can also be conveniently used in the present
system. Additionally, antigens derived from HBV, such as the core antigen, the surface
antigen, sAg, as well as the presurface sequences, preS1 and preS2 (formerly called
preS), as well as combinations of the above, such as sAg/preS1, sAg/preS2,
sAg/preS1/preS2, and preS1/preS2, will find use herein. See, e.g., "HBV Vaccines -
from the laboratory to license: a case study" in Mackett, M. and Williamson, J.D., Human
Vaccines and Vaccination, pp. 159-176, for a discussion of HBV structure; Beames et al.,
J. Virol. (1995) 69:6833-6838; Birnbaum et al., J. Virol. (1990) 64:3319-3330; and Zhou
et al., J. Virol. (1991) 65:5457-5464.

Polynucleotide sequences encoding proteins derived from other viruses will also 
find use in the expression systems, such as without limitation, proteins from members of 
the families Picornaviridae (e.g., polioviruses, etc.); Caliciviridae; Togaviridae (e.g., 
rubella virus, dengue virus, etc.); Flaviviridae; Coronaviridae; Reoviridae; Birnaviridae; 
Rhabdoviridae (e.g., rabies virus, etc.); Filoviridae; Paramyxoviridae (e.g., mumps
virus, measles virus, respiratory syncytial virus, etc.); Orthomyxoviridae (e.g., influenza
virus types A, B and C, etc.); Bunyaviridae; Arenaviridae; Retroviridae (e.g., HTLV-I;
HTLV-II; HIV-1 (also known as HTLV-III, LAV, ARV, hTLR, etc.)), including but not
limited to antigens from the isolates HIVIIIb, HIVSF2, HIVLAV, HIVLAI, HIVMN;
HIV-1CM235, HIV-1US4; HIV-2; simian immunodeficiency virus (SIV) among others. See, e.g.,
Virology, 3rd Edition (W.K. Joklik ed. 1988); Fundamental Virology, 2nd Edition (B.N. 
Fields and D.M. Knipe, eds. 1991), for a description of these and other viruses. 

For example, the invention may be used in expression constructs to express genes
encoding the gp120 envelope protein from any of the above HIV isolates. The gp120
sequences for a multitude of HIV-1 and HIV-2 isolates, including members of the various
genetic subtypes of HIV, are known and reported (see, e.g., Myers et al., Los Alamos
Database, Los Alamos National Laboratory, Los Alamos, New Mexico (1992); Myers et
al., Human Retroviruses and Aids, 1990, Los Alamos, New Mexico: Los Alamos
National Laboratory; and Modrow et al., J. Virol. (1987) 61:570-578, for a comparison of
the envelope gene sequences of a variety of HIV isolates) and sequences derived from
any of these isolates will find use in the present methods. Furthermore, the invention is
equally applicable to other immunogenic proteins derived from any of the various HIV
isolates, including any of the various envelope proteins such as gp160 and gp41, gag
antigens such as p24gag and p55gag, as well as proteins derived from the pol region.

The present invention will also find use in expression constructs for the 
expression of influenza virus proteins. Specifically, the envelope glycoproteins HA and 
NA of influenza A are of particular interest for generating an immune response. 
Numerous HA subtypes of influenza A have been identified (Kawaoka et al., Virology
(1990) 179:759-767; Webster et al., "Antigenic variation among type A influenza
viruses," p. 127-168. In: P. Palese and D.W. Kingsbury (ed.), Genetics of influenza
viruses. Springer-Verlag, New York). Thus, the gene sequences encoding proteins
derived from any of these isolates can also be used in the recombinant production 
techniques described herein. 

Furthermore, the fragments described herein provide a means for producing 
proteins useful for treating a variety of malignant cancers. For example, the system of the 
present invention can be used to produce a variety of tumor antigens which in turn may 
be used to mount both humoral and cell-mediated immune responses to particular 
proteins specific to the cancer in question, such as an activated oncogene, a fetal antigen, 
or an activation marker. Such tumor antigens include any of the various MAGEs 
(melanoma associated antigen E), including MAGE 1, 2, 3, 4, etc. (Boon, T. Scientific 
American (March 1993):82-89); any of the various tyrosinases; MART 1 (melanoma 
antigen recognized by T cells), mutant ras; mutant p53; p97 melanoma antigen; CEA 
(carcinoembryonic antigen), among others. 

It is readily apparent that the subject invention can be used to produce a variety of 
proteins useful for the prevention, treatment and/or diagnosis of a wide variety of 
diseases. 

Polynucleotide sequences coding for the above-described molecules can be 
obtained using recombinant methods, such as by screening cDNA and genomic libraries 
from cells expressing the gene, or by deriving the gene from a vector known to include 
the same. Furthermore, the desired gene can be isolated directly from cells and tissues 
containing the same, using standard techniques, such as phenol extraction and PCR of 
cDNA or genomic DNA. See, e.g., Sambrook et al., supra, for a description of
techniques used to obtain and isolate DNA. The gene of interest can also be produced 
synthetically, rather than cloned. The nucleotide sequence can be designed with the 
appropriate codons for the particular amino acid sequence desired. In general, one will 
select preferred codons for the intended host in which the sequence will be expressed. 
The complete coding sequence may be assembled from overlapping oligonucleotides
prepared by standard methods. See, e.g., Edge,
Nature (1981) 292:756; Nambair et al., Science (1984) 223:1299; Jay et al., J. Biol.
Chem. (1984) 259:6311.
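
The codon-selection step described above (designing a nucleotide sequence from a desired amino acid sequence using codons preferred by the intended host) can be illustrated with a short Python sketch. The codon table below is a deliberately tiny, hypothetical example: each entry is a valid codon for its amino acid under the standard genetic code, but which codon is actually preferred must be taken from a codon-usage table for the particular host. The names `EXAMPLE_PREFERRED_CODONS` and `back_translate` are illustrative.

```python
# Hypothetical "preferred codon" table for illustration only. Each codon does
# encode the listed amino acid, but host preference is an assumption here.
EXAMPLE_PREFERRED_CODONS = {
    "M": "ATG",  # methionine (only codon)
    "K": "AAG",  # lysine
    "L": "CTG",  # leucine
    "S": "AGC",  # serine
    "*": "TGA",  # stop
}


def back_translate(protein: str, codon_table: dict) -> str:
    """Build a coding sequence by choosing one codon per amino acid."""
    codons = []
    for residue in protein.upper():
        if residue not in codon_table:
            raise ValueError(f"no codon listed for residue {residue!r}")
        codons.append(codon_table[residue])
    return "".join(codons)


coding_sequence = back_translate("MKLS*", EXAMPLE_PREFERRED_CODONS)
```

A sequence designed in this way would then be split into overlapping oligonucleotides for assembly, a step this sketch does not attempt to model.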

Markers and amplifiers can also be employed in the subject expression systems.
A variety of markers are known which are useful in selecting for transformed cell lines
and generally comprise a gene whose expression confers a selectable phenotype on
transformed cells when the cells are grown in an appropriate selective medium. Such
markers for mammalian cell lines include, for example, the bacterial xanthine-guanine
phosphoribosyl transferase gene, which can be selected for in medium containing
mycophenolic acid and xanthine (Mulligan et al. (1981) Proc. Natl. Acad. Sci. USA
78:2072-2076), and the aminoglycoside phosphotransferase gene (specifying a protein
that inactivates the antibacterial action of neomycin/kanamycin derivatives), which can
be selected for using medium containing neomycin derivatives such as G418 which are
normally toxic to mammalian cells (Colbere-Garapin et al. (1981) J. Mol. Biol. 150:1-14).
Useful markers for other expression systems are well known to those of skill in the
art. These and other selectable markers can be obtained from commercially available
plasmids, using techniques well known in the art. See, e.g., Sambrook et al., supra.

Expression can also be amplified by placing an amplifiable gene, such as the
mouse dihydrofolate reductase (dhfr) gene, adjacent to the coding sequence. Transformants
can then be selected for methotrexate resistance in dhfr-deficient host cells. See, e.g.,
Urlaub et al. (1980) Proc. Natl. Acad. Sci. USA 77:4216-4220; Ringold et al. (1981)
J. Mol. and Appl. Genet. 1:165-175. Constructs which include both markers and amplifiers will also find
use in the subject expression vectors, such as any of the various EMCV-DHFR/Neo 
constructs described in, e.g., U.S. Patent No. 6,096,505, incorporated herein by reference 
in its entirety. 

Transcription termination and polyadenylation sequences may also be present,
located 3' to the translation stop codon for the coding sequence. Examples of
transcription terminator/polyadenylation signals include, but are not limited to, those
derived from SV40, as described in Sambrook et al., supra, as well as a bovine growth
hormone terminator sequence.

Also present in the expression constructs of the invention will be a promoter
region. The promoter may be the homologous hCMV IE1 promoter normally associated 
with the intact, full-length Intron A sequence from which the fragment is derived, a 
heterologous CMV IE1 promoter (e.g., from a different CMV strain), or even a non-CMV 
IE1 promoter. The choice of promoter will depend on the cell type used for expression 
and is readily determined by one of skill in the art. For example, typical promoters for 
mammalian cell expression include the SV40 early promoter, a CMV promoter as 
described above, the mouse mammary tumor virus LTR promoter, the adenovirus major 
late promoter (Ad MLP), the RSV promoter, the SRα promoter, the herpes simplex virus
promoter, and tissue-specific promoters, among others. One particular promoter used in the
constructs described herein is a promoter derived from the hCMV IE1 enhancer/promoter 
region depicted in Figure 2, such as approximately nucleotide positions 460 to 1264 of 
Figure 2, or functional portions of this region. Other nonviral promoters, such as a 
promoter derived from the murine metallothionein gene, will also find use for 
mammalian expression. Insect cell expression systems, typically Baculovirus systems, 
will generally include a polyhedrin promoter. Promoters for use in bacterial systems 
include promoter sequences derived from sugar metabolizing enzymes, such as galactose, 
lactose (lac) (Chang et al., Nature (1977) 198:1056), and maltose; promoter sequences
derived from biosynthetic enzymes such as tryptophan (trp) (Goeddel et al., Nuc. Acids
Res. (1980) 8:4057; Yelverton et al., Nucl. Acids Res. (1981) 9:731; U.S. Patent
No. 4,738,921; EPO Publication Nos. 036,776 and 121,775); the β-lactamase (bla)
promoter system (Weissmann (1981) "The cloning of interferon and other mistakes" in
Interferon 3 (ed. I. Gresser)); bacteriophage lambda PL (Shimatake et al., Nature (1981)
292:128); the T5 promoter (U.S. Patent No. 4,689,406); and hybrid promoters such as tac, a
hybrid trp-lac promoter (Amann et al., Gene (1983) 25:167; de Boer et al., Proc. Natl.
Acad. Sci. (1983) 80:21). Promoters useful in yeast expression systems include, for
example, promoters from sequences encoding enzymes in the metabolic pathway such as 
alcohol dehydrogenase (ADH) (EPO Publication No. 284,044), enolase, glucokinase, 
glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or 
GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate
kinase (PyK) (EPO Publication No. 329,203) promoters. Other promoters for use in such
systems include a promoter derived from the yeast PHO5 gene, encoding acid
phosphatase (Myanohara et al., Proc. Natl. Acad. Sci. USA (1983) 80:1), as well as
synthetic promoters, such as a promoter formed by the fusion of UAS sequences of one yeast
promoter with the transcription activation region of another yeast promoter, creating a 
synthetic hybrid promoter. Examples of such hybrid promoters include the ADH 
regulatory sequence linked to the GAP transcription activation region (U.S. Patent Nos. 
4,876,197 and 4,880,734), as well as promoters which consist of the regulatory sequences 
of either the ADH2, GAL4, GAL10, or PHO5 genes, combined with the transcriptional
activation region of a glycolytic enzyme gene such as GAP or PyK (EPO Publication No. 
164,556). These and other promoters can be obtained from commercially available 
plasmids, using techniques well known in the art. See, e.g., Sambrook et al., supra. 

An expression vector is constructed so that the particular coding sequence is 
located in the vector with the Intron A fragment and the appropriate regulatory sequences, 
the positioning and orientation of the coding sequence with respect to the control 
sequences being such that the coding sequence is transcribed under the "control" of the 
control sequences (i.e., RNA polymerase which binds to the DNA molecule at the 
promoter transcribes the coding sequence). Modification of the sequences encoding the 
molecule of interest may be desirable to achieve this end. For example, in some cases it 
may be necessary to modify the sequence so that it can be attached to the promoter and 
other control sequences in the appropriate orientation; i.e., to maintain the reading frame. 
The promoter sequence and other regulatory sequences may be ligated to the coding 
sequence prior to insertion into a vector. Alternatively, the coding sequence can be 
cloned directly into an expression vector which already contains the Intron A fragment 
and an appropriate restriction site. 

It may also be desirable to produce mutants or analogs of the gene of interest. 
Mutants or analogs of the polypeptide of interest may be prepared by the deletion of a 
portion of the sequence encoding the polypeptide of interest, by insertion of a sequence, 
and/or by substitution of one or more nucleotides within the sequence. Techniques for 
modifying nucleotide sequences, such as site-directed mutagenesis, and the like, are well
known to those skilled in the art. See, e.g., Sambrook et al., supra; Kunkel, T.A. (1985)
Proc. Natl. Acad. Sci. USA 82:448; Geisselsoder et al. (1987) BioTechniques
5:786; Zoller and Smith (1983) Methods Enzymol. 100:468; Dalbie-McFarland et al.
(1982) Proc. Natl. Acad. Sci. USA 79:6409.

Once the expression constructs are assembled, they can be used in a wide variety 
of systems, including insect, mammalian, bacterial, viral and yeast expression systems, 
all well known in the art. Nucleic acid molecules comprising nucleotide sequences of 
interest can be stably integrated into a host cell genome or maintained on a stable 
episomal element in a suitable host cell using various gene delivery techniques well 
known in the art. See, e.g., U.S. Patent No. 5,399,346. 

For example, insect cell expression systems, such as baculovirus systems, are 
known to those of skill in the art and described in, e.g., Summers and Smith, Texas 
Agricultural Experiment Station Bulletin No. 1555 (1987). Materials and methods for 
baculovirus/insect cell expression systems are commercially available in kit form from, 
inter alia, Invitrogen, San Diego CA ("MaxBac" kit). Similarly, bacterial and 
mammalian cell expression systems are well known in the art and described in, e.g., 
Sambrook et al., supra. Yeast expression systems are also known in the art and described 
in, e.g., Yeast Genetic Engineering (Barr et al., eds., 1989) Butterworths, London. 

A number of appropriate host cells for use with the above systems are also known. 
For example, mammalian cell lines are known in the art and include immortalized cell 
lines available from the American Type Culture Collection (ATCC), such as, but not 
limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) 
cells, monkey kidney cells (COS), human embryonic kidney cells, human hepatocellular 
carcinoma cells (e.g., Hep G2), Madin-Darby bovine kidney ("MDBK") cells, as well as 
others. Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus 
spp., will find use with the present expression constructs. Yeast hosts useful in the 
present invention include inter alia, Saccharomyces cerevisiae, Candida albicans, 
Candida maltosa, Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, 
Pichia guilliermondii, Pichia pastoris, Schizosaccharomyces pombe and Yarrowia
lipolytica. Insect cells for use with baculovirus expression vectors include, inter alia,
Aedes aegypti, Autographa californica, Bombyx mori, Drosophila melanogaster,
Spodoptera frugiperda, and Trichoplusia ni. 

A wide variety of methods can be used to deliver the expression constructs to 
cells. Such methods include DEAE dextran-mediated transfection, calcium phosphate 
precipitation, polylysine- or polyornithine-mediated transfection, or precipitation using 
other insoluble inorganic salts, such as strontium phosphate, aluminum silicates including 
bentonite and kaolin, chromic oxide, magnesium silicate, talc, and the like. Other useful 
methods of transfection include electroporation, sonoporation, protoplast fusion, 
liposomes, peptoid delivery, or microinjection. See, e.g., Sambrook et al., supra, for a 
discussion of techniques for transforming cells of interest. 

For example, the expression constructs can be packaged in liposomes prior to 
delivery to the cells. Lipid encapsulation is generally accomplished using liposomes 
which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed 
DNA to lipid preparation can vary but will generally be around 1:1 (mg
DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers
for delivery of nucleic acids, see Hug and Sleight, Biochim. Biophys. Acta (1991)
1097:1-17; Straubinger et al., in Methods of Enzymology (1983), Vol. 101, pp. 512-527.
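
By way of illustration only, the ratio stated above can be applied as simple arithmetic. The following sketch assumes the ratio is expressed as micromoles of lipid per milligram of DNA; the function name, the parameter names and the example DNA amount are hypothetical and are not taken from the disclosure.

```python
def lipid_micromoles(dna_mg, micromoles_lipid_per_mg_dna=1.0):
    """Return micromoles of lipid to combine with dna_mg milligrams of DNA.

    The default of 1.0 reflects the approximately 1:1 (mg DNA:micromoles lipid)
    ratio given in the text; larger values correspond to "more of lipid".
    """
    if dna_mg < 0:
        raise ValueError("DNA mass must be non-negative")
    if micromoles_lipid_per_mg_dna <= 0:
        raise ValueError("ratio must be positive")
    return dna_mg * micromoles_lipid_per_mg_dna


# Hypothetical example: 50 micrograms (0.05 mg) of DNA at the 1:1 ratio.
print(lipid_micromoles(0.05))
```

A ratio above 1.0 may be passed as the second argument where an excess of lipid is desired.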

Liposomal preparations for use with the present invention include cationic 
(positively charged), anionic (negatively charged) and neutral preparations, with cationic 
liposomes particularly preferred. Cationic liposomes are readily available. For example, 
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available
under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also,
Felgner et al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other commercially
available lipids include transfectace (DDAB/DOPE) and DOTAP/DOPE (Boehringer).
Other cationic liposomes can be prepared from readily available materials using
techniques well known in the art. See, e.g., Szoka et al., Proc. Natl. Acad. Sci. USA
(1978) 75:4194-4198; PCT Publication No. WO 90/11092 for a description of the
synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes. The
various liposome-nucleic acid complexes are prepared using methods known in the art.
See, e.g., Straubinger et al., in METHODS OF IMMUNOLOGY (1983), Vol. 101, pp.
512-527; Szoka et al., Proc. Natl. Acad. Sci. USA (1978) 75:4194-4198; Papahadjopoulos
et al., Biochim. Biophys. Acta (1975) 394:483; Wilson et al., Cell (1979) 17:77; Deamer
and Bangham, Biochim. Biophys. Acta (1976) 443:629; Ostro et al., Biochem. Biophys.
Res. Commun. (1977) 76:836; Fraley et al., Proc. Natl. Acad. Sci. USA (1979) 76:3348;
Enoch and Strittmatter, Proc. Natl. Acad. Sci. USA (1979) 76:145; Fraley et al., J. Biol.
Chem. (1980) 255:10431; Szoka and Papahadjopoulos, Proc. Natl. Acad. Sci. USA (1978)
75:145; and Schaefer-Ridder et al., Science (1982) 215:166.

The DNA can also be delivered in cochleate lipid compositions similar to those 
described by Papahadjopoulos et al., Biochim. Biophys. Acta (1975) 394:483-491. See,
also, U.S. Patent Nos. 4,663,161 and 4,871,488.

Depending on the expression system and host selected, the molecules are 
produced by growing host cells transformed by an expression vector described above 
under conditions whereby the protein is expressed. The expressed protein is then isolated 
from the host cells and purified. If the expression system secretes the protein into growth 
media, the product can be purified directly from the media. If it is not secreted, it can be
isolated from cell lysates. The selection of the appropriate growth conditions and
recovery methods is within the skill of the art. For example, once expressed, the
product may be isolated and purified by any number of techniques well known in the art,
including: chromatography, e.g., HPLC, affinity chromatography, ion exchange
chromatography, size-exclusion, etc.; electrophoresis; density gradient centrifugation;
solvent extraction, or the like. See, e.g., Protein Purification Principles and Practice,
2nd edition (Robert K. Scopes ed. 1987); and Protein Purification Methods, a Practical
Approach (E.L.V. Harris and S. Angal, eds. 1990).

The expression constructs of the present invention may also be used for nucleic 
acid immunization and gene therapy, using standard gene delivery protocols. Methods
for gene delivery are known in the art. See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 
5,589,466, incorporated by reference herein in their entireties. Genes can be delivered 
either directly to the vertebrate subject or, alternatively, delivered ex vivo, to cells derived 
from the subject and the cells reimplanted in the subject.

A number of viral-based systems have been developed for gene transfer into
mammalian cells. For example, retroviruses provide a convenient platform for gene
delivery systems. A selected gene can be inserted into a vector and packaged in retroviral
particles using techniques known in the art. The recombinant virus can then be isolated
and delivered to cells of the subject either in vivo or ex vivo. A number of retroviral
systems have been described (U.S. Patent No. 5,219,740; Miller and Rosman,
BioTechniques (1989) 7:980-990; Miller, A.D., Human Gene Therapy (1990) 1:5-14;
Scarpa et al., Virology (1991) 180:849-852; Burns et al., Proc. Natl. Acad. Sci. USA
(1993) 90:8033-8037; and Boris-Lawrie and Temin, Curr. Opin. Genet. Develop. (1993)
3:102-109). Briefly, retroviral gene delivery vehicles of the present invention may be
readily constructed from a wide variety of retroviruses, including for example, B, C, and
D type retroviruses as well as spumaviruses and lentiviruses such as FIV, HIV, HIV-1,
HIV-2 and SIV (see RNA Tumor Viruses, Second Edition, Cold Spring Harbor
Laboratory, 1985). Such retroviruses may be readily obtained from depositories or
collections such as the American Type Culture Collection ("ATCC"; 10801 University
Blvd., Manassas, VA 20110-2209), or isolated from known sources using commonly
available techniques.

A number of adenovirus vectors have also been described. Unlike retroviruses,
which integrate into the host genome, adenoviruses persist extrachromosomally, thus
minimizing the risks associated with insertional mutagenesis (Haj-Ahmad and Graham, J.
Virol. (1986) 57:267-274; Bett et al., J. Virol. (1993) 67:5911-5921; Mittereder et al.,
Human Gene Therapy (1994) 5:717-729; Seth et al., J. Virol. (1994) 68:933-940; Barr et
al., Gene Therapy (1994) 1:51-58; Berkner, K.L. BioTechniques (1988) 6:616-629; and
Rich et al., Human Gene Therapy (1993) 4:461-476).

Additionally, various adeno-associated virus (AAV) vector systems have been
developed for gene delivery. AAV vectors can be readily constructed using techniques
well known in the art. See, e.g., U.S. Patent Nos. 5,173,414 and 5,139,941; International
Publication Nos. WO 92/01070 (published 23 January 1992) and WO 93/03769
(published 4 March 1993); Lebkowski et al., Molec. Cell. Biol. (1988) 8:3988-3996;
Vincent et al., Vaccines 90 (1990) (Cold Spring Harbor Laboratory Press); Carter, B.J.
Current Opinion in Biotechnology (1992) 3:533-539; Muzyczka, N. Current Topics in
Microbiol. and Immunol. (1992) 158:97-129; Kotin, R.M. Human Gene Therapy (1994)
5:793-801; Shelling and Smith, Gene Therapy (1994) 1:165-169; and Zhou et al., J. Exp.
Med. (1994) 179:1867-1875.

Molecular conjugate vectors, such as the adenovirus chimeric vectors described in
Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl. Acad.
Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

Members of the Alphavirus genus, such as but not limited to vectors derived from
the Sindbis and Semliki Forest viruses and VEE, will also find use as viral vectors for
delivering the gene of interest. For a description of Sindbis-virus derived vectors useful
for the practice of the instant methods, see Dubensky et al., J. Virol. (1996) 70:508-519;
and International Publication Nos. WO 95/07995 and WO 96/17072.

The expression constructs of the present invention can also be delivered without a
viral vector. For example, the construct can be delivered directly, or packaged in
liposomes prior to delivery to the subject or to cells derived therefrom, as described
above.

The expression constructs may also be encapsulated, adsorbed to, or associated 
with, particulate carriers. Such carriers present multiple copies of a selected molecule to 
the immune system and promote trapping and retention of molecules in local lymph
nodes. The particles can be phagocytosed by macrophages and can enhance antigen
presentation through cytokine release. Examples of particulate carriers include those
derived from polymethyl methacrylate polymers, as well as microparticles derived from
poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al.,
Pharm. Res. (1993) 10:362-368; and McGee et al., J. Microencap. (1996).

Furthermore, other particulate systems and polymers can be used for the in vivo or
ex vivo delivery of the expression constructs. For example, polymers such as polylysine,
polyarginine, polyornithine, spermine, spermidine, as well as conjugates of these
molecules, are useful for transferring a nucleic acid of interest. Similarly, DEAE
dextran-mediated transfection, calcium phosphate precipitation or precipitation using other
insoluble inorganic salts, such as strontium phosphate, aluminum silicates including
bentonite and kaolin, chromic oxide, magnesium silicate, talc, and the like, will find use
with the present system. See, e.g., Felgner, P.L., Advanced Drug Delivery Reviews
(1990) 5:163-187, for a review of delivery systems useful for gene transfer.

Additionally, biolistic delivery systems employing particulate carriers such as
gold and tungsten are especially useful for delivering the expression constructs of the
present invention. The particles are coated with the construct to be delivered and
accelerated to high velocity, generally under a reduced atmosphere, using a gunpowder
discharge from a "gene gun." For a description of such techniques, and apparatuses
useful therefor, see, e.g., U.S. Patent Nos. 4,945,050; 5,036,006; 5,100,792; 5,179,022;
5,371,015; and 5,478,744.

Deposits of Strains Useful in Practicing the Invention 

A deposit of biologically pure cultures of the following strains was made with the
American Type Culture Collection, 10801 University Boulevard, Manassas, VA. The
accession number indicated was assigned after successful viability testing, and the
requisite fees were paid. The deposit was made under the provisions of the Budapest
Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose
of Patent Procedure and the Regulations thereunder (Budapest Treaty). This assures
maintenance of viable cultures for a period of thirty (30) years from the date of deposit. The
organisms will be made available by the ATCC under the terms of the Budapest Treaty,
which assures permanent and unrestricted availability of the progeny to one determined
by the U.S. Commissioner of Patents and Trademarks to be entitled thereto according to
35 U.S.C. §122 and the Commissioner's rules pursuant thereto (including 37 C.F.R. §1.12
with particular reference to 886 OG 638). Upon the granting of a patent, all restrictions
on the availability to the public of the deposited cultures will be irrevocably removed.

These deposits are provided merely as a convenience to those of skill in the art, and
are not an admission that a deposit is required under 35 U.S.C. §112. The nucleic acid
sequences of these genes, as well as the amino acid sequences of the molecules encoded
thereby, are incorporated herein by reference and are controlling in the event of any
conflict with the description herein. A license may be required to make, use, or sell the
deposited materials, and no such license is hereby granted.

Plasmid     Deposit Date          ATCC No.

pCON3       September 27, 2000    PTA-2504



III. Experimental

Below are examples of specific embodiments for carrying out the present 
invention. The examples are offered for illustrative purposes only, and are not intended 
to limit the scope of the present invention in any way. 

Efforts have been made to ensure accuracy with respect to numbers used (e.g., 
amounts, temperatures, etc.), but some experimental error and deviation should, of
course, be allowed for. 

Restriction and modifying enzymes, as well as other reagents for DNA
manipulations, were purchased from commercial sources and used according to the
manufacturers' directions. In the cloning of DNA fragments, except where noted, all
DNA manipulations were done according to standard procedures. See, e.g., Sambrook et
al., supra.

Example 1 

Production of Expression Constructs Including Intron A Fragments 
A series of 13 expression constructs was made in which between 54 and 688
nucleotides were deleted from within the core region of Intron A, bounded by the splice
donor and branch point sites. The expression constructs were linked to the firefly (Photinus
pyralis) luciferase gene or to a codon-optimized HIV p55gag gene (Zur Megede et al., J.
Virol. (2000) 74:2628-2635).

The initial deletion of Intron A was prepared by substituting a 778 base
pair NsiI-SalI fragment from plasmid pCMVkmLuc (International Publication No. WO
98/06437) with a synthetic oligonucleotide (Figure 1B) that restored the last 80
nucleotides of Intron A (with optimized branch point and polypyrimidine tract sequences
as shown in Figure 1B), together with Exon 2 of the 5'-UTR (nucleotides 821-834 of
Figure 1A). The resulting construct contained a 688 bp deletion from Intron A and is
shown in Figure 1A. The resulting expression plasmid, pCON3, contains the hCMV
enhancer/promoter region with a 130 bp Intron A fragment. The final sequence of the
intron in pCON3 is shown in Figure 1C.
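
As a rough check on the figures stated above, the proportion of Intron A removed in pCON3 can be estimated from the 688 bp deletion and the 130 bp retained fragment. The sketch below assumes that these two values sum to the full intron length, which the text does not state explicitly; the names used are illustrative only.

```python
DELETED_BP = 688   # deletion from Intron A, as stated in the text
RETAINED_BP = 130  # Intron A fragment remaining in pCON3, as stated in the text


def fraction_deleted(deleted_bp, retained_bp):
    """Fraction of the original intron removed, assuming
    original length = deleted + retained (an assumption, not a stated fact)."""
    total = deleted_bp + retained_bp
    if total <= 0:
        raise ValueError("total length must be positive")
    return deleted_bp / total


full_length = DELETED_BP + RETAINED_BP
print(full_length)                                         # 818
print(f"{fraction_deleted(DELETED_BP, RETAINED_BP):.1%}")  # 84.1%
```

Under this assumption the result is in line with the figure of approximately 85% given for pCON3 in Example 3.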

Twelve additional Intron A deletion constructs were made by progressive deletion
within plasmid pCMVII (U.S. Patent No. 6,096,505) in either the 5'->3' direction from
the unique NsiI site toward the unique HpaI site, or in the 3'->5' direction from the HpaI
site toward the NsiI site (see Figure 3 and Table 1). Following the restriction enzyme
digests, the plasmids were treated with T4 DNA polymerase and excess dNTPs.
Resulting blunt-ended vector fragments were gel-purified and self-ligated. As shown in
Figure 3, these constructs included deletions within the intron ranging from 54 to 663
base pairs in length. To generate expression vectors carrying the resulting intron
modifications, the NdeI-SalI fragment from the truncation plasmids was substituted into
plasmid pCMVkmLuc digested with NdeI and SalI. Of these constructs, selected ones
were digested with SalI-XbaI to generate recipient vector fragments for the insertion of
the codon-optimized HIV p55gag gene obtained by digestion of plasmid
pCMVkm2.GAGmod.SF2 (Zur Megede et al., J. Virol. (2000) 74:2628-2635).

Table 1

Digest         Deletion        NT. deleted from Intron A
               Length (bp)     (following digest, blunting, religation)

NsiI-CelII     70              47-116
NsiI-XcmI      113             47-159
NsiI-PflMI     150             47-196
NsiI-MroI      345             47-391
NsiI-BfrI      578             47-624
NsiI-PvuII     609             47-655
NsiI-HpaI      663             47-709

HpaI-PvuII     54              656-709
HpaI-BfrI      80              630-709
HpaI-MroI      314             395-709
HpaI-PflMI     516             193-709
HpaI-CelII     590             119-709
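
As a consistency check on Table 1, the following sketch recomputes each deletion length from the listed nucleotide range. It assumes that the ranges are 1-based and inclusive at both ends, a convention the text does not state; the function names are illustrative. Under that assumption the NsiI-anchored rows and the two shortest HpaI-anchored rows agree with the listed lengths, whereas three HpaI-anchored rows differ by one nucleotide. The source does not say how the blunted ends were counted, so the table values are used as given.

```python
# Rows of Table 1 as (digest, listed deletion length in bp, first nt, last nt).
TABLE_1 = [
    ("NsiI-CelII", 70, 47, 116),
    ("NsiI-XcmI", 113, 47, 159),
    ("NsiI-PflMI", 150, 47, 196),
    ("NsiI-MroI", 345, 47, 391),
    ("NsiI-BfrI", 578, 47, 624),
    ("NsiI-PvuII", 609, 47, 655),
    ("NsiI-HpaI", 663, 47, 709),
    ("HpaI-PvuII", 54, 656, 709),
    ("HpaI-BfrI", 80, 630, 709),
    ("HpaI-MroI", 314, 395, 709),
    ("HpaI-PflMI", 516, 193, 709),
    ("HpaI-CelII", 590, 119, 709),
]


def inclusive_span(first_nt, last_nt):
    """Number of positions in a range, counting both endpoints (assumed convention)."""
    return last_nt - first_nt + 1


def mismatched_rows(rows):
    """Return (digest, listed, computed) for rows where listed and computed lengths differ."""
    result = []
    for digest, listed, first_nt, last_nt in rows:
        computed = inclusive_span(first_nt, last_nt)
        if computed != listed:
            result.append((digest, listed, computed))
    return result


for digest, listed, computed in mismatched_rows(TABLE_1):
    print(f"{digest}: listed {listed} bp, inclusive span {computed} bp")
```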



Example 2 

Expression of a Heterologous Coding Sequence Using Intron A Fragments 
293 (ATCC Accession No. CRL-1573) and RD (ATCC Accession No. CCL-136)
cells were grown in DMEM medium supplemented with fetal calf serum (10% v/v).
Fourteen hours prior to transfection, 2x10^5 cells/well were seeded into 6 well plates.
Transient transfection was done using 2 µg of the vector DNA described above per well,
using 12 µg of Fugene 6 (Roche Molecular Biochemicals, Indianapolis, IN) per supplier
instructions, in 6 replicate wells per construct. Forty-eight hours post-transfection, cell
lysates were analyzed for reporter gene expression. HIV p55gag expression was
evaluated by means of a p24 antigen ELISA (Coulter, Miami, FL). Geometric mean
titers across each plate (construct) were calculated.
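
For illustration, a geometric mean over the replicate wells of one construct may be computed as in the sketch below. The readings shown are hypothetical placeholders, not data from this Example, and the function name is illustrative; the text does not specify the software or exact procedure that was used.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed through logarithms."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical readings from the 6 replicate wells of a single construct.
replicate_wells = [120.0, 95.0, 150.0, 110.0, 130.0, 105.0]
print(round(geometric_mean(replicate_wells), 1))
```

The geometric mean is less sensitive than the arithmetic mean to a single unusually high well, which is one common reason for preferring it with titer-like data.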

Transient transfection of 293 cells and evaluation of luciferase expression 
indicated that nearly all of these derivatives expressed as well as or better than the parent
vector, pCMVkm-Luciferase, containing the full-length Intron A sequence. The
constructs containing the two largest intron deletions (pCON3, ΔHpaI-CelII) showed the
greatest enhancement, approximately two-fold greater than the parent vector (Figure 4).

To further evaluate the effect on expression of a smaller intron, the entire
sequence of Intron A was substituted with the 126 base pair Intron I from the rabbit
β-globin gene (RβG-IVSI). Figure 5A shows the wild-type rabbit β-globin gene sequence
used. In vitro analysis of p55gag expression indicated that the wild-type construct
expressed up to approximately 1.8 times higher than the parent vector,
pCMVkm-Luciferase (Figure 6). The wild-type sequences for the splice donor, branch point and
polyY tract of RβG-IVSI are suboptimal relative to the consensus sequences for these
elements. Therefore, the construct containing RβG-IVSI was modified such that these
sequence elements were optimized. Figure 5B shows the optimized rabbit β-globin gene
sequence used, termed RβG-OPTI. Analysis of this construct showed approximately 4
times higher p55gag expression as compared to the parent vector in vitro (Figure 6).

All 14 modified-intron constructs were analyzed for efficiency of RNA transcript 
splicing by RT-PCR. For RNA transcript analysis, 293 cells were transiently transfected 
and then lysed using RNAstat 60 (Tel-Test B, Inc., Friendswood, TX) to yield total cell 
RNA. Extracted RNA was digested with RQ1-DNase (Promega Corp., Madison, WI) and
subjected to RT-PCR using the GeneAmp RNA PCR kit (Roche Molecular
Biochemicals, Indianapolis, IN). PCR spanning the region of the intron was done using
an upstream primer in exon 1 of the 5' UTR [primer "KBT-162"; seq.
CGCTGTTTTGACCTCCATA (SEQ ID NO: 7)] and a downstream primer from the 
luciferase reporter gene [primer "KBT-163"; seq. GTTGAGCAATTCACGTTCAT (SEQ 
ID NO: 8)]; a control PCR of actin transcripts was also performed for each RNA 
preparation. All of the mutants spliced efficiently, within the sensitivity of the assay, as 
no products of lengths predicted for unspliced messages were detected. 
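
Basic properties of the two primers recited above can be tabulated as in the following sketch. The melting temperature estimate uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is an assumption introduced here for illustration; the disclosure does not report annealing temperatures, and the function names are hypothetical.

```python
PRIMERS = {
    "KBT-162": "CGCTGTTTTGACCTCCATA",   # upstream primer (SEQ ID NO: 7)
    "KBT-163": "GTTGAGCAATTCACGTTCAT",  # downstream primer (SEQ ID NO: 8)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), gc_count(seq), wallace_tm(seq), reverse_complement(seq))
```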

Example 3

Nucleic Acid Immunization Using the Intron A Fragment 
In order to test the ability of the Intron A fragments to direct transcription in vivo,
Balb/C mice in groups of 6 animals (Charles River Co., Wilmington, MA) were
immunized once bilaterally in the tibialis anterior muscle with 5 µg of naked vector DNA
per injection site (prepared endotoxin-free [Qiagen, Inc., Valencia, CA] and formulated in
normal saline). Three- and six-week post-immunization bleeds were analyzed by ELISA
for anti-HIV p55gag antibody as described in Zur Megede et al., J. Virol. (2000)
74:2628-2635.

The constructs evaluated are shown in Figure 7. Variable immunogenicities were
seen after a single immunization (see Figure 7). Significantly, the pCON3 vector, in which
approximately 85% of Intron A is deleted, yielded higher geometric mean titers than the
parent pCMVkm2.GAGmod.SF2 vector (Figure 7). At three weeks post-immunization,
the titer was approximately twice that of the parent vector, though this fell off by six
weeks post-injection.

Accordingly, novel hCMV Intron A fragments and methods of using the same 
have been disclosed. From the foregoing, it will be appreciated that, although specific 
embodiments of the invention have been described herein for purposes of illustration,
various modifications may be made without deviating from the spirit and scope of the
appended claims.